AITopics | clustering sequence

Clustering Sequences with Hidden Markov Models

Neural Information Processing SystemsApr-6-2023, 18:08:38 GMT

This paper discusses a probabilistic model-based approach to clus(cid:173) tering sequences, using hidden Markov models (HMMs) . The prob(cid:173) lem can be framed as a generalization of the standard mixture model approach to clustering in feature space. Two primary issues are addressed. First, a novel parameter initialization procedure is proposed, and second, the more difficult problem of determining the number of clusters K, from the data, is investigated. Experi(cid:173) mental results indicate that the proposed techniques are useful for revealing hidden cluster structure in data sets of sequences.

cid, clustering sequence, hidden markov model

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Clustering sequence sets for motif discovery

Neural Information Processing SystemsApr-6-2023, 13:52:03 GMT

Most of existing methods for DNA motif discovery consider only a single set of sequences to find an over-represented motif. In contrast, we consider multiple sets of sequences where we group sets associated with the same motif into a cluster, assuming that each set involves a single motif. Clustering sets of sequences yields clusters of coherent motifs, improving signal-to-noise ratio or enabling us to identify multiple motifs. We present a probabilistic model for DNA motif discovery where we identify multiple motifs through searching for patterns which are shared across multiple sets of sequences. Our model infers cluster-indicating latent variables and learns motifs simultaneously, where these two tasks interact with each other.

motif, motif discovery, sequence, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.45)

Add feedback

Clustering sequence sets for motif discovery

Kim, Jong K., Choi, Seungjin

Neural Information Processing SystemsFeb-15-2020, 02:26:12 GMT

Most of existing methods for DNA motif discovery consider only a single set of sequences to find an over-represented motif. In contrast, we consider multiple sets of sequences where we group sets associated with the same motif into a cluster, assuming that each set involves a single motif. Clustering sets of sequences yields clusters of coherent motifs, improving signal-to-noise ratio or enabling us to identify multiple motifs. We present a probabilistic model for DNA motif discovery where we identify multiple motifs through searching for patterns which are shared across multiple sets of sequences. Our model infers cluster-indicating latent variables and learns motifs simultaneously, where these two tasks interact with each other.

motif, motif discovery, sequence, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.50)

Add feedback

Clustering Sequences with Hidden Markov Models

Smyth, Padhraic

Neural Information Processing SystemsDec-31-1997

This paper discusses a probabilistic model-based approach to clustering sequences, using hidden Markov models (HMMs). The problem can be framed as a generalization of the standard mixture model approach to clustering in feature space. Two primary issues are addressed. First, a novel parameter initialization procedure is proposed, and second, the more difficult problem of determining the number of clusters K, from the data, is investigated. Experimental results indicate that the proposed techniques are useful for revealing hidden cluster structure in data sets of sequences.

likelihood, matrix, sequence, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Clustering Sequences with Hidden Markov Models

Smyth, Padhraic

Neural Information Processing SystemsDec-31-1997

This paper discusses a probabilistic model-based approach to clustering sequences, using hidden Markov models (HMMs). The problem can be framed as a generalization of the standard mixture model approach to clustering in feature space. Two primary issues are addressed. First, a novel parameter initialization procedure is proposed, and second, the more difficult problem of determining the number of clusters K, from the data, is investigated. Experimental results indicate that the proposed techniques are useful for revealing hidden cluster structure in data sets of sequences.

likelihood, matrix, sequence, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Clustering Sequences with Hidden Markov Models

Smyth, Padhraic

Neural Information Processing SystemsDec-31-1997

This paper discusses a probabilistic model-based approach to clustering sequences,using hidden Markov models (HMMs) . The problem can be framed as a generalization of the standard mixture model approach to clustering in feature space. Two primary issues are addressed. First, a novel parameter initialization procedure is proposed, and second, the more difficult problem of determining the number of clusters K, from the data, is investigated. Experimental resultsindicate that the proposed techniques are useful for revealing hidden cluster structure in data sets of sequences.

artificial intelligence, machine learning, sequence, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback